Abstract. In this paper we enumerate $k$-noncrossing RNA pseudoknot structures with given
minimum stack-length. We show that the numbers of $k$-noncrossing structures without isolated
base pairs are significantly smaller than the number of all $k$-noncrossing structures. In particular
we prove that the number of 3- and 4-noncrossing RNA structures with stack-length $\ge 2$ is for
large $n$ given by $311.2470\,\frac{4!}{n(n-1)\cdots(n-4)}\,2.5881^n$ and
$1.217\cdot 10^{11}\,n^{-21/2}\,3.0382^n$, respectively. We furthermore show that for
$k$-noncrossing RNA structures the drop in exponential growth rates between the number of all
structures and the number of all structures with stack-size $\ge 2$ increases significantly with $k$.
Our results are of importance for prediction algorithms for pseudoknot RNA and provide evidence
that there exist neutral networks of RNA pseudoknot structures.



1. Introduction 



An RNA structure is the helical configuration of an RNA sequence, described by its primary
sequence of nucleotides A, G, U and C together with the Watson-Crick (A-U, G-C) and (U-G)
base pairing rules. Subject to these rules, single-stranded RNA forms helical structures. Since the
function of many RNA sequences is oftentimes tantamount to their structures, it is of central
importance to understand RNA structure in the context of studying the function of biological RNA,
as well as in the design process of artificial RNA. In the following we use a coarse grained notion of
structure by concentrating on the pairs of nucleotide positions corresponding to the chemical bonds
and ignoring any spatial embedding. There are several ways to represent these RNA structures
[10, 31]. We choose the diagram representation [23], which is particularly well suited for displaying
the crossings of the Watson-Crick base pairs. A diagram is a labeled graph over the vertex set
$[n] = \{1, \dots, n\}$ with degree $\le 1$, represented by drawing its vertices $1, \dots, n$ in a
horizontal line and its arcs $(i,j)$, where $i < j$, in the upper half-plane. The vertices and arcs
correspond to nucleotides and Watson-Crick (A-U, G-C) and (U-G) base pairs, respectively. We
categorize diagrams according to the 3 parameters $(k, \lambda, \sigma)$: the maximum number of
mutually crossing arcs, $k-1$, the minimum arc-length, $\lambda$, and the minimum stack-length,
$\sigma$. Here the length of an arc $(i,j)$ is $j-i$ and a stack of length $\sigma$ is a sequence of
"parallel" arcs of the form $((i,j), (i+1,j-1), \dots, (i+(\sigma-1), j-(\sigma-1)))$, see Figure 1.
We call an arc of length $\lambda$ a $\lambda$-arc.
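The three parameters $(k, \lambda, \sigma)$ can be read off a diagram mechanically. The following is a minimal sketch, assuming a diagram is given as a list of arcs $(i,j)$ with $i < j$; the function names and the example diagram are hypothetical and are not taken from the figures of this paper.

```python
from itertools import combinations

def crosses(a, b):
    """Arcs (i, j) and (r, s) cross if i < r < j < s."""
    (i, j), (r, s) = sorted((a, b))
    return i < r < j < s

def max_mutually_crossing(arcs):
    """Largest number of pairwise crossing arcs (brute force, small inputs only)."""
    arcs = list(arcs)
    best = min(1, len(arcs))
    for size in range(2, len(arcs) + 1):
        if any(all(crosses(a, b) for a, b in combinations(c, 2))
               for c in combinations(arcs, size)):
            best = size
        else:
            break  # no crossing family of this size, hence none larger
    return best

def min_arc_length(arcs):
    """Minimum of j - i over all arcs (the parameter lambda)."""
    return min(j - i for i, j in arcs)

def min_stack_length(arcs):
    """Length of the shortest maximal stack (the parameter sigma)."""
    arcs = set(arcs)
    lengths = []
    for (i, j) in arcs:
        if (i - 1, j + 1) in arcs:
            continue  # not the outermost arc of its stack
        length = 1
        while (i + length, j - length) in arcs:
            length += 1
        lengths.append(length)
    return min(lengths)

# Hypothetical example over [8]: two stacks of length 2 that cross each other.
example = [(1, 6), (2, 5), (3, 8), (4, 7)]
k = max_mutually_crossing(example) + 1  # at most k - 1 arcs mutually cross
print(k, min_arc_length(example), min_stack_length(example))
```

The brute-force crossing test is exponential in the number of arcs and is only meant to make the definitions concrete.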




Figure 1. $k$-noncrossing diagrams: in the upper diagram the red/blue/green arcs
mutually cross, the arc with minimum length 3 is (3,6) and the arc (1,5) is isolated.
Hence this is a 4-noncrossing, $\lambda = 3$, $\sigma = 1$ diagram without isolated vertices.
Analogously, below we have a 3-noncrossing (no red/green cross), $\lambda = 4$, $\sigma = 2$
diagram with isolated vertices 3, 13.

In the following we call a $k$-noncrossing diagram with arc-length $\ge 2$ and stack-length
$\ge \sigma$ a $k$-noncrossing RNA structure (of type $(k,\sigma)$). We denote the set (number) of
$k$-noncrossing RNA structures of type $(k,\sigma)$ by $\mathcal{T}_{k,\sigma}(n)$
($\mathsf{T}_{k,\sigma}(n)$) and refer to $k$-noncrossing RNA structures for $k \ge 3$ as
pseudoknot RNA structures. Intuitively, a higher number of pairwise crossing arcs is tantamount
to higher structural complexity, and crossing bonds are a reality |1D]. These pseudoknot bonds [32]
occur in functional RNA (RNAseP [Hj), ribosomal RNA [13] and are conserved in the catalytic
core of group I introns, see Figure 2, where we show the diagram representation of the catalytic
core region of the group I self-splicing intron. For $k = 2$ we have RNA structures with no 2
crossing arcs, i.e. the well-known RNA secondary structures, whose combinatorics was pioneered
by Waterman et al. [17], [28], [29], [3 H, [30]. RNA secondary structures are structures of type (2, 1).

Figure 2. Diagram representation of the catalytic core region of the group I self-
splicing intron [3]. Six tertiary interactions are shown as green lines. The gaps after
G54, U72, G103 and A112 indicate that some nucleotides are omitted which are involved
in an unrelated structural motif.

There are many reasons why pseudoknot structures are fascinating. First, compared to secondary
structures their "mathematical" properties are much more intriguing [10, 11, 12]. Their enumeration
employs the nontrivial concepts of vacillating tableaux [H [S] and singular expansions [TTl [T^] .
Secondly, the recurrence relation for the numbers of 3-noncrossing RNA structures [10] is, in
contrast to that for secondary structures, "enumerative" but not "constructive". This indicates that
prediction of pseudoknot RNA is much more involved compared to the dynamic programming routine
used for secondary structures. Nevertheless, there exist several prediction algorithms for RNA
pseudoknot structures [111 [HI [TJ [T^ which are able to express certain "types" of pseudoknots. In
this context the notion of the "language of RNA" has been put forward [27]. The combinatorial
analysis in [10, 11, 12] shows that 3-noncrossing RNA structures ($\mathsf{T}_{3,1}(n)$) exhibit an
exponential growth rate of 4.7913, and even when considering only structures with minimum
arc-length 3 the rate is 4.5492. This is bad news, since already for $k = 3$ this rate exceeds 4, the
exponential growth rate of the number of sequences over the natural four-letter alphabet.
Therefore, a priori, not all 3-noncrossing structures can be folded by sequences. The situation
becomes worse for higher $k$: the results of [HI [T2j imply the following exponential growth rates
$\gamma_{k,1}$ for $k$-noncrossing RNA structures:



k             2       3       4       5       6        7        8        9        10
γ_{k,1}       2.6180  4.7913  6.8540  8.8873  10.9087  12.9232  14.9321  16.9405  18.9466

Here $\gamma_{k,1}^{-1}$ denotes the dominant real singularity of the generating function of
$k$-noncrossing RNA structures.


Can we identify and analyze those $k$-noncrossing structures that do "occur"? To this end, let us
consider this question in the biophysical context: RNA structures are formed by Watson-Crick
(A-U, G-C) and (U-G) base pairs and, due to the specific chemistry of the latter, parallel bonds are
thermodynamically more stable. This fact is well-known and has led to the notion of "canonical"
structures [M], i.e. structures in which there exists no isolated base pair, see Figure 3. The question
then is, do canonical $k$-noncrossing structures exhibit significantly smaller growth rates? Why this
(to our knowledge) has not been seriously pursued could be explained by a result due to Schuster
et al. [6], who have proved the following: the number of RNA secondary structures,
$\mathsf{T}_{2,1}(n)$, exhibits an exponential growth rate of $\gamma_{2,1} = 2.6180$ while the
number of canonical RNA secondary structures $\mathsf{T}_{2,2}(n)$ has an exponential growth rate
of $\gamma_{2,2} = 1.9680$. In other words, the exponential growth rate drops less than 25% when
passing from arbitrary to canonical secondary structures. We remark that Schuster's enumerative
result is of central importance, since the growth rate of canonical secondary structures implies the
existence of a "many to one" sequence to structure mapping. This has, subsequently, led to the
concept of neutral networks [18].

Figure 3. A canonical structure

In the following we will develop a novel combinatorial framework which allows us to enumerate any
RNA structure class of type $(k, \sigma)$, for any $k$, $\sigma$. We then can report good news:
there is indeed a significant drop in the exponential growth rates when passing from
$k$-noncrossing RNA structures to their canonical counterparts for $k \ge 3$. Explicitly we can give
the following data

k             2       3       4       5       6        7        8        9        10
γ_{k,1}       2.6180  4.7913  6.8540  8.8873  10.9087  12.9232  14.9321  16.9405  18.9466
γ_{k,2}       1.9680  2.5881  3.0382  3.4138  3.7439   4.0420   4.3159   4.5714   4.8114

where the case $k = 2$ is due to [6], which is independently confirmed by our approach. In
particular, for 3-noncrossing RNA structures, we have a drop in exponential growth rate from 4.7913
to 2.5881, i.e. about 46%, and for $k = 10$ there is a drop of more than 74%. As a result, the number
of canonical 3-noncrossing RNA structures grows at nearly the same exponential rate (2.5881) as
that of arbitrary secondary structures (2.6180). Intuitively this makes perfect sense since canonicity
implies parallel arcs, which severely limits crossings, and it can be expected to have a dramatic
effect on $k$-noncrossing RNA for large $k$. In other words, the biophysical constraints
(thermodynamic stability) counteract the combinatorial variety, see Figure 4.

Figure 4. Biophysical constraints inducing parallel arcs: the hammerhead ribozyme
[2]. Its two tertiary interactions are shown as green lines. The gap after C25 indicates
that some nucleotides are omitted, which are involved in an unrelated structural motif.

The main idea in this paper is to consider a new type of $k$-noncrossing structure that can be
considered as being "dual" to canonical structures. We consider $k$-noncrossing structures in which
there exist no two arcs of the form $(i,j)$, $(i+1,j-1)$. These structures are called $k$-noncrossing
core-structures and $\mathsf{C}_k(n)$ denotes their number. The key observation with respect to
core-structures is the following: any structure has a unique core obtained by identifying all arcs
contained in stacks by a single arc and keeping isolated vertices. Furthermore, the number of all
structures is a sum of the numbers of the corresponding core-structures with positive integer
coefficients. In Figure 5 we illustrate the idea of how a core-structure is obtained.

Figure 5. Core-structures. Each sequence of stacked arcs in the 3-noncrossing (canonical)
structure (lhs) is replaced by its unique arc with minimal length (rhs). The so derived
core-structure is unique. We show in Lemma 2 that this assignment yields a well-defined
mapping (i.e. no arcs of the form $(i,i+1)$ are being produced).

It is of particular interest to note that Figure 5 shows that deriving the core-structure can reduce
the minimum arc-length, but cannot produce arcs of the form $(i,i+1)$. In Theorem 3 we derive the
generating function for core-structures, which shows that "most" $k$-noncrossing structures are in
fact core-structures. Denoting the exponential growth rate of $k$-noncrossing core-structures by
$\kappa_k^{-1}$ we have the situation

k             2       3       4       5       6        7        8        9        10
γ_{k,1}       2.6180  4.7913  6.8540  8.8873  10.9087  12.9232  14.9321  16.9405  18.9466
κ_k^{-1}      2.5152  4.7097  6.7921  8.8378  10.8672  12.8866  14.9031  16.9119  18.9215



In Theorem 4 we derive a functional identity for the generating function for $k$-noncrossing RNA
structures with stack-length $\ge \sigma$, which allows us to obtain exact and asymptotic results on
$\mathsf{T}_{k,\sigma}(n)$, i.e. all $k$-noncrossing RNA structures with stack-length $\ge \sigma$.
In its proof the number of $k$-noncrossing core-structures plays a central role. As for the quality
of the asymptotic expressions we compare in the table below the subexponential factors for 3- and
4-noncrossing RNA structures with stack-length $\ge 2$,
$t_{3,2}(n) = 311.2470\,\frac{4!}{n(n-1)(n-2)(n-3)(n-4)}$ and
$t_{4,2}(n) = 1.217\cdot 10^{11}\,n^{-21/2}$, with $\mathsf{T}_{3,2}(n)\,\gamma_{3,2}^{-n}$ and
$\mathsf{T}_{4,2}(n)\,\gamma_{4,2}^{-n}$, respectively. Here $\gamma_{3,2}$ and $\gamma_{4,2}$
denote the respective exponential growth rates:

The subexponential factor

n     T_{3,2}(n) γ_{3,2}^{-n}   t_{3,2}(n)         T_{4,2}(n) γ_{4,2}^{-n}   t_{4,2}(n)
50    1.214 × 10^{-5}           2.938 × 10^{-5}    3.115 × 10^{-8}           1.763 × 10^{-7}
60    5.498 × 10^{-6}           1.140 × 10^{-5}    6.884 × 10^{-9}           2.599 × 10^{-8}
70    2.776 × 10^{-6}           5.143 × 10^{-6}    1.841 × 10^{-9}           5.151 × 10^{-9}
80    1.522 × 10^{-6}           2.589 × 10^{-6}    5.708 × 10^{-10}          1.268 × 10^{-9}
90    8.905 × 10^{-7}           1.416 × 10^{-6}    1.991 × 10^{-10}          3.680 × 10^{-10}
100   5.487 × 10^{-7}           8.268 × 10^{-7}    7.650 × 10^{-11}          1.217 × 10^{-10}

2. Some basic facts 

In this Section we provide the basic facts needed for proving Theorem 3 in Section 3 and Theorem
4 in Section 4. For background on crossings and nestings in diagrams and partitions we recommend
the paper of Chen et al. [4] and for the analytic combinatorics and asymptotic analysis the book
of Flajolet [7]. Our results are based on the generating function of $k$-noncrossing RNA structures
[10] and the asymptotic analysis of $k$-noncrossing RNA structures [Ullllj, summarized in Theorem 1
below.

Let us first recall our basic terminology: by $\mathcal{T}_{k,\sigma}(n)$ we denote the set of
$k$-noncrossing RNA structures with minimum stack-length $\sigma$ and we let
$\mathsf{T}_{k,\sigma}(n)$ denote their number. That is, $\mathcal{T}_{k,\sigma}(n)$ can be
identified with the set of diagrams with degree $\le 1$, represented by drawing the vertices
$1, \dots, n$ in a horizontal line and its arcs $(i,j)$, where $i < j$, in the upper half-plane with
arc-length $\ge 2$ and stack-length $\ge \sigma$, in which the maximum number of mutually crossing
arcs is $k-1$. Furthermore let $\mathcal{T}_{k,\sigma}(n,h)$ denote the set of $k$-noncrossing RNA
structures with stack-length $\ge \sigma$ having $h$ arcs and let $\mathsf{T}_{k,\sigma}(n,h)$ denote
their number. We denote by $f_k(n,\ell)$ the number of $k$-noncrossing diagrams with arbitrary
arc-length and $\ell$ isolated points. In Figure 6 we display the various types of diagrams
involved.

Figure 6. Basic diagram types: (a) perfect matching ($f_3(8,0)$), (b) partial matching
with 1-arc (4, 5) and isolated points 6, 8 ($f_3(8,2)$), (c) structure (i.e. minimum arc-length
$\ge 2$) with minimum stack-length 2 and no isolated points ($\mathcal{T}_{3,2}(8)$) and (d)
structure with minimum stack-length 3 and isolated points 4, 8 ($\mathcal{T}_{2,3}(8)$).



The following identities are due to Grabiner et al. [9]

(2.1)  $\sum_{n\ge 0} f_k(n,0)\,\frac{x^n}{n!} = \det[I_{i-j}(2x) - I_{i+j}(2x)]\big|_{i,j=1}^{k-1}$

(2.2)  $\sum_{n\ge 0}\Big\{\sum_{\ell=0}^{n} f_k(n,\ell)\Big\}\frac{x^n}{n!} = e^{x}\,\det[I_{i-j}(2x) - I_{i+j}(2x)]\big|_{i,j=1}^{k-1}$

where $I_r(2x) = \sum_{j\ge 0}\frac{x^{2j+r}}{j!\,(r+j)!}$ denotes the hyperbolic Bessel function of
the first kind of order $r$. Eq. (2.1) and (2.2) allow "in principle" for explicit computation of the
numbers $f_k(n,\ell)$. In particular for $k = 2$ and $k = 3$ we have the formulas

(2.3)  $f_2(n,\ell) = \binom{n}{\ell}\,C_{(n-\ell)/2}$ and $f_3(n,\ell) = \binom{n}{\ell}\Big[C_{\frac{n-\ell}{2}+2}\,C_{\frac{n-\ell}{2}} - C_{\frac{n-\ell}{2}+1}^{2}\Big]$

where $C_m$ denotes the $m$-th Catalan number. The second formula results from a determinant
formula enumerating pairs of nonintersecting Dyck-paths. In view of

$f_k(n,\ell) = \binom{n}{\ell}\, f_k(n-\ell,0)$

everything can be reduced to perfect matchings, where we have the following situation: there
exists an asymptotic approximation of the hyperbolic Bessel function due to [16] and employing
the subtraction of singularities-principle [16] one can prove

(2.4)  $\forall\, k \in \mathbb{N};\quad f_k(2n,0) \sim \varphi_k(n)\left(\frac{1}{\rho_k}\right)^{2n}$

where $\rho_k$ is the dominant real singularity of $\sum_{n\ge 0} f_k(2n,0)\,z^{2n}$ and
$\varphi_k(n)$ is a polynomial over $n$. Via Hadamard's formula, $\rho_k$ can be expressed as

(2.5)  $\rho_k = \lim_{n\to\infty}\big(f_k(2n,0)\big)^{-\frac{1}{2n}}$ .

Eq. (2.4) allows us to obtain $\varphi_k(n)$ explicitly for any $k$.
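Eq. (2.3) is easy to check numerically. The following is a minimal sketch, assuming the Catalan determinant form of $f_3$ as reconstructed in eq. (2.3); the brute-force enumeration of perfect matchings and all function names are illustrative and are not part of the original text.

```python
from itertools import combinations
from math import comb

def catalan(m):
    """m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)

def f2(n, l):
    """f_2(n, l): noncrossing diagrams on n vertices with l isolated points, eq. (2.3)."""
    if l > n or (n - l) % 2:
        return 0
    return comb(n, l) * catalan((n - l) // 2)

def f3(n, l):
    """f_3(n, l): 3-noncrossing diagrams on n vertices with l isolated points, eq. (2.3)."""
    if l > n or (n - l) % 2:
        return 0
    m = (n - l) // 2
    return comb(n, l) * (catalan(m + 2) * catalan(m) - catalan(m + 1) ** 2)

def perfect_matchings(points):
    """Yield all perfect matchings of an even-sized list of points."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching

def is_3_noncrossing(arcs):
    """True if no three arcs are mutually crossing."""
    def crosses(a, b):
        (i, j), (r, s) = sorted((a, b))
        return i < r < j < s
    return not any(crosses(a, b) and crosses(a, c) and crosses(b, c)
                   for a, b, c in combinations(arcs, 3))

def brute_f3_perfect(n):
    """Count 3-noncrossing perfect matchings on 2n points by enumeration."""
    points = list(range(1, 2 * n + 1))
    return sum(is_3_noncrossing(m) for m in perfect_matchings(points))

for n in range(1, 5):
    print(n, f3(2 * n, 0), brute_f3_perfect(n))
```

The enumeration grows like $(2n-1)!!$, so the comparison is only practical for small $n$.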

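As for the generating function and asymptotics of $k$-noncrossing RNA structures we have the
following result

Theorem 1. [10, 11] Let $k \in \mathbb{N}$, $k \ge 2$. Then the number of $k$-noncrossing RNA
structures with $\frac{n-\ell}{2}$ arcs, $\mathsf{T}_{k,1}(n,\frac{n-\ell}{2})$, and the number of
$k$-noncrossing RNA structures, $\mathsf{T}_{k,1}(n)$, are given by

(2.6)  $\mathsf{T}_{k,1}\big(n,\tfrac{n-\ell}{2}\big) = \sum_{b=0}^{(n-\ell)/2} (-1)^b \binom{n-b}{b}\, f_k(n-2b,\ell)$

(2.7)  $\mathsf{T}_{k,1}(n) = \sum_{b=0}^{\lfloor n/2\rfloor} (-1)^b \binom{n-b}{b}\Big\{\sum_{\ell=0}^{n-2b} f_k(n-2b,\ell)\Big\}$

where $\{\sum_{\ell=0}^{n-2b} f_k(n-2b,\ell)\}$ is given via eq. (2.2) and furthermore

$\mathsf{T}_{3,1}(n) \sim \frac{10.4724\cdot 4!}{n(n-1)\cdots(n-4)}\left(\frac{5+\sqrt{21}}{2}\right)^{n}$ .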

The following functional identity is taken from the literature and relates the bivariate generating
function for $\mathsf{T}_{k,1}(n,h)$, the number of RNA pseudoknot structures with $h$ arcs, to the
generating function of $k$-noncrossing perfect matchings.

Lemma 1. Let $k \in \mathbb{N}$, $k \ge 2$ and $z, u$ be indeterminates over $\mathbb{C}$. Then we have

(2.8)  $\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,1}(n,h)\,u^h z^n = \frac{1}{u z^2 - z + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u}\,z}{u z^2 - z + 1}\right)^{2n}$ .

In particular we have for $u = 1$,

(2.9)  $\sum_{n\ge 0} \mathsf{T}_{k,1}(n)\,z^n = \frac{1}{z^2 - z + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{z}{z^2 - z + 1}\right)^{2n}$ .

In view of Lemma 1 it is of interest to deduce relations between the coefficients from the equality
of generating functions. The class of theorems that deal with this deduction are called transfer
theorems [7]. One key ingredient in this framework is a specific domain in which the functions in
question are analytic, which is "slightly" bigger than their respective disc of convergence. It is
tailored for extracting the coefficients via Cauchy's integral formula. Details on the method can be
found in [7] and its application to 3-noncrossing RNA in [11]. To be precise the domain in question
is

Definition 1. Given two numbers $\phi$, $R$, where $R > 1$ and $0 < \phi < \frac{\pi}{2}$, and
$\rho \in \mathbb{R}$, the open domain $\Delta_\rho(\phi, R)$ is defined as

(2.10)  $\Delta_\rho(\phi, R) = \{ z \mid |z| < R,\ z \neq \rho,\ |\mathrm{Arg}(z - \rho)| > \phi \}$

A domain is a $\Delta_\rho$-domain if it is of the form $\Delta_\rho(\phi, R)$ for some $R$ and
$\phi$. A function is $\Delta_\rho$-analytic if it is analytic in some $\Delta_\rho$-domain.

We use the notation

(2.11)  $\big(f(z) = O(g(z))$ as $z \to \rho\big) \iff \big(f(z)/g(z)$ is bounded as $z \to \rho\big)$

and if we write $f(z) = O(g(z))$ it is implicitly assumed that $z$ tends to a (unique) singularity.
$[z^n]\,f(z)$ denotes the coefficient of $z^n$ in the power series expansion of $f(z)$ around 0.

Theorem 2. [5] Let $f(z)$, $g(z)$ be $\Delta_\rho$-analytic functions given by power series
$f(z) = \sum_{n\ge 0} a_n z^n$ and $g(z) = \sum_{n\ge 0} b_n z^n$. Suppose $f(z) = O(g(z))$ for all
$z \in \Delta_\rho$ and $b_n \sim \varphi(n)(\rho^{-1})^n$, where $\varphi(n)$ is a polynomial over
$n$. Then

(2.12)  $a_n = [z^n]\,f(z) \sim K\,[z^n]\,g(z) = K\,b_n \sim K\,\varphi(n)(\rho^{-1})^n$

for some constant $K$.

Transfer theorems accordingly translate error terms from functions to coefficients, and this
translation is guaranteed when the functions in question are analytic in some $\Delta_\rho$-domain.

3. Core-structures 

As discussed in the Introduction, a core-structure is a $k$-noncrossing structure with no stacked
base pairs. We denote the set and number of core-structures over $[n]$ by $\mathcal{C}_k(n)$ and
$\mathsf{C}_k(n)$, respectively. Analogously $\mathcal{C}_k(n,h)$ and $\mathsf{C}_k(n,h)$ denote the
set and the number of core-structures having $h$ arcs. In Lemma 2 below we establish that the
number of all $k$-noncrossing structures with stack-length $\ge \sigma$ is a sum of the numbers of
$k$-noncrossing cores with positive integer coefficients.

Lemma 2. (Core-lemma) For $k, h, \sigma \in \mathbb{N}$, $k \ge 2$, $1 \le h \le n/2$ we have

(3.1)  $\mathsf{T}_{k,\sigma}(n,h) = \sum_{b=\sigma-1}^{h-1} \binom{b + (2-\sigma)(h-b) - 1}{h-b-1}\,\mathsf{C}_k(n-2b,\,h-b)$ .

Remark 1. Lemma 2 cannot be used in order to enumerate diagrams with arc-length $\ge \lambda$,
where $\lambda > 2$, and stack-length $\sigma$. Basically, $k$-noncrossing structures with arc-length
$\ge \lambda$ can have core-structures with arc-length 2, see Figure 7. The enumeration of
$k$-noncrossing RNA structures with arc-length $\ge 3$ and stack-length $\ge 2$ is work in progress.




Figure 7. Core-structures will in general have 2-arcs: the structure
$\delta \in \mathcal{T}_{3,2}(19)$ (lhs) is mapped into its core $c(\delta)$ (rhs). Clearly $\delta$
has arc-length $\ge 5$ and as a consequence of the collapse of the stack
$((i+1, j+3), \dots, (i+4, j))$ (the blue arcs are being removed) into the arc $(i+4, j)$,
$c(\delta)$ contains the 2-arc $(i, i+5)$.

Proof. First, there exists a mapping from $k$-noncrossing structures with $h$ arcs and minimum
stack size $\sigma$ over $[n]$ into core-structures:

(3.2)  $c\colon \mathcal{T}_{k,\sigma}(n,h) \to \dot\bigcup_{0\le b\le h-1} \mathcal{C}_k(n-2b,\,h-b), \quad \delta \mapsto c(\delta)$

where the core-structure $c(\delta)$ is obtained in two steps: first we map arcs and isolated vertices
as follows

(3.3)  $\forall\, \ell \ge \sigma - 1;\quad ((i-\ell, j+\ell), \dots, (i,j)) \mapsto (i,j)$ and $j \mapsto j$ if $j$ is isolated,

and second we relabel the vertices of the resulting diagram from left to right in increasing order.
That is, we replace each stack by a single arc and keep isolated points and then relabel, see Figure 8.

Figure 8. The mapping $c\colon \mathcal{T}_{k,\sigma}(n,h) \to \dot\bigcup_{0\le b\le h-1}\mathcal{C}_k(n-2b,\,h-b)$
is obtained in two steps: first contraction of the stacks and secondly relabeling of the resulting
diagram.
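The two steps of the core map, contraction and relabeling, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: a structure is given as a vertex count together with a list of arcs, each stack is replaced by its innermost arc (the arc of minimal length, as in Figure 5), and the example structure is hypothetical.

```python
def core(n, arcs):
    """Return (n', arcs') of the core-structure of a structure over [n]."""
    arcs = set(arcs)
    # Step 1: keep only the innermost arc of every stack.
    kept = {(i, j) for (i, j) in arcs if (i + 1, j - 1) not in arcs}
    removed_vertices = {v for arc in arcs - kept for v in arc}
    # Step 2: relabel the surviving vertices from left to right.
    survivors = [v for v in range(1, n + 1) if v not in removed_vertices]
    relabel = {v: t for t, v in enumerate(survivors, start=1)}
    return len(survivors), sorted((relabel[i], relabel[j]) for i, j in kept)

# Hypothetical structure over [11] with two stacks of length 2.
n_core, arcs_core = core(11, [(1, 6), (2, 5), (7, 11), (8, 10)])
print(n_core, arcs_core)
```

In this example $h = 4$ arcs collapse to $h - b = 2$ arcs, so $b = 2$ arcs are removed and the core lives on $n - 2b = 7$ vertices, in line with eq. (3.2).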

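We have to prove that $c\colon \mathcal{T}_{k,\sigma}(n,h) \to \dot\bigcup_{0\le b\le h-1}\mathcal{C}_k(n-2b,\,h-b)$
is well-defined, i.e. that $c$ cannot produce 1-arcs. Indeed, since $\delta \in \mathcal{T}_{k,\sigma}(n,h)$
does not contain 1-arcs, every innermost arc $(i,j)$ of a stack has a vertex strictly between $i$
and $j$; any such vertex that is removed belongs to another stack whose innermost arc has an
endpoint strictly between $i$ and $j$, and this endpoint is kept. We can conclude that $c(\delta)$
has by construction arcs of length $\ge 2$. $c$ is by construction surjective. Keeping track of
multiplicities gives rise to the map

(3.4)  $f_{k,\sigma}\colon \mathcal{T}_{k,\sigma}(n,h) \to \dot\bigcup_{0\le b\le h-1}\Big[\mathcal{C}_k(n-2b,\,h-b) \times \Big\{(a_j)_{1\le j\le h-b} \;\Big|\; \sum_{j=1}^{h-b} a_j = b,\ a_j \ge \sigma - 1\Big\}\Big]$

given by $f_{k,\sigma}(\delta) = (c(\delta), (a_j)_{1\le j\le h-b})$, where $a_j$ denotes the number
of arcs removed from the $j$-th stack. We can conclude that $f_{k,\sigma}$ is well-defined and a
bijection.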

We proceed by computing the multiplicities of the resulting core-structures:

Claim.

(3.5)  $\Big|\Big\{(a_j)_{1\le j\le h-b} \;\Big|\; \sum_{j=1}^{h-b} a_j = b;\ a_j \ge \sigma - 1\Big\}\Big| = \binom{b + (2-\sigma)(h-b) - 1}{h-b-1}$ .

Clearly, $a_j \ge \sigma - 1$ is equivalent to $\mu_j = a_j - \sigma + 2 \ge 1$ and we have

$\sum_{j=1}^{h-b} \mu_j = \sum_{j=1}^{h-b} (a_j - \sigma + 2) = b + (2-\sigma)(h-b)$ .

We next show that

(3.6)  $\Big|\Big\{(\mu_j)_{1\le j\le h-b} \;\Big|\; \sum_{j=1}^{h-b} \mu_j = b + (2-\sigma)(h-b);\ \mu_j \ge 1\Big\}\Big|$

is equal to the number of $(h-b-1)$-subsets of $\{1, 2, \dots, b + (2-\sigma)(h-b) - 1\}$. Consider
the set

(3.7)  $\{\mu_1,\ \mu_1 + \mu_2,\ \dots,\ \mu_1 + \mu_2 + \dots + \mu_{h-b-1}\}$

consisting of $h-b-1$ distinct elements of
$[b + (2-\sigma)(h-b) - 1] = \{1, 2, \dots, b + (2-\sigma)(h-b) - 1\}$, i.e. an $(h-b-1)$-subset of
$[b + (2-\sigma)(h-b) - 1]$. Given any $(h-b-1)$-subset of $[b + (2-\sigma)(h-b) - 1]$, we can
arrange its elements in linear order and retrieve the sequence $\{\mu_i \mid 1 \le i \le h-b\}$ of
positive integers with sum $b + (2-\sigma)(h-b)$. Therefore the above assignment is a bijection.
Since the number of $(h-b-1)$-subsets of $[b + (2-\sigma)(h-b) - 1]$ is given by
$\binom{b + (2-\sigma)(h-b) - 1}{h-b-1}$ the Claim follows.
We can conclude from the Claim and eq. (3.4) that

(3.8)  $\mathsf{T}_{k,\sigma}(n,h) = \sum_{b=\sigma-1}^{h-1} \binom{b + (2-\sigma)(h-b) - 1}{h-b-1}\,\mathsf{C}_k(n-2b,\,h-b)$

holds and the lemma follows. □
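The Claim inside the proof is a statement about counting integer sequences, so it can be confirmed by exhaustive enumeration for small parameters. The following is a minimal sketch; `parts` stands for $h - b$, the function names are hypothetical, and the binomial coefficient is taken to be zero whenever its upper index is negative (an assumption about the convention used).

```python
from itertools import product
from math import comb

def count_sequences(b, parts, sigma):
    """Brute-force count of (a_1, ..., a_parts) with sum b and every a_j >= sigma - 1."""
    return sum(1 for a in product(range(sigma - 1, b + 1), repeat=parts) if sum(a) == b)

def claim_binomial(b, parts, sigma):
    """Right-hand side of eq. (3.5), with parts = h - b."""
    top = b + (2 - sigma) * parts - 1
    return comb(top, parts - 1) if top >= 0 else 0

for sigma in (1, 2, 3):
    for parts in (1, 2, 3):
        for b in range(0, 7):
            assert count_sequences(b, parts, sigma) == claim_binomial(b, parts, sigma)
print("eq. (3.5) confirmed for all tested parameters")
```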



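Next, we prove a functional identity between the bivariate generating functions of
$\mathsf{T}_{k,\sigma}(n,h)$ and $\mathsf{C}_k(n,h)$. This identity plays a central role in proving
Theorem 3 and Theorem 4 in Section 4.

Lemma 3. Let $k, \sigma \in \mathbb{N}$, $k \ge 2$ and let $u, x$ be indeterminates. Then we have the
functional relation

(3.9)  $\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,\sigma}(n,h)\,u^h x^n = \sum_{n\ge 0}\sum_{h\le n/2} \mathsf{C}_k(n,h)\left(\frac{u\cdot(u x^2)^{\sigma-1}}{1 - u x^2}\right)^{h} x^n + \frac{x}{1-x}$

and in particular, for $u = 1$

(3.10)  $\sum_{n\ge 0} \mathsf{T}_{k,\sigma}(n)\,x^n = \sum_{n\ge 0}\sum_{h\le n/2} \mathsf{C}_k(n,h)\left(\frac{(x^2)^{\sigma-1}}{1 - x^2}\right)^{h} x^n + \frac{x}{1-x}$ .

Proof. We set $\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,\sigma}(n,h)\,u^h x^n = \sum_{h\ge 0} \varphi_h(x)\,u^h$
and compute, using Lemma 2,

(3.11)  $\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,\sigma}(n,h)\,u^h x^n = \sum_{n\ge 0}\sum_{h\le n/2}\sum_{b\le h-1} \mathsf{C}_k(n-2b,\,h-b)\binom{b + (2-\sigma)(h-b) - 1}{h-b-1} u^h x^n + \sum_{i\ge 1} x^i$

where the term $\sum_{i\ge 1} x^i = \frac{x}{1-x}$ comes from the fact that for $h = 0$ the binomial
$\binom{b + (2-\sigma)(h-b) - 1}{h-b-1}$ is zero, while the lhs counts for any $i \ge 1$ the unique
structure having only isolated vertices. We proceed to compute

$\sum_{h\ge 1}\sum_{b\le h-1}\sum_{n\ge 2h} \binom{b + (2-\sigma)(h-b) - 1}{h-b-1}\,\mathsf{C}_k(n-2b,\,h-b)\,u^h x^n + \frac{x}{1-x}$
$\quad = \sum_{b\ge 0}\sum_{h > b} \binom{b + (2-\sigma)(h-b) - 1}{h-b-1}\,u^h x^{2b} \sum_{n\ge 2h} \mathsf{C}_k(n-2b,\,h-b)\,x^{n-2b} + \frac{x}{1-x}$ .

Setting $m = h - b$ and subsequently interchanging the summation indices we arrive at

$\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,\sigma}(n,h)\,u^h x^n = \sum_{m\ge 1}\Big[\sum_{b\ge 0} \binom{b + (2-\sigma)m - 1}{m-1}(u x^2)^b\Big]\,u^m \sum_{n\ge 2m} \mathsf{C}_k(n,m)\,x^n + \frac{x}{1-x}$ .

The binomial vanishes unless $b \ge (\sigma-1)m$, and substituting $t = b - (\sigma-1)m$ together with
$\sum_{t\ge 0}\binom{t+m-1}{m-1} y^t = (1-y)^{-m}$ yields

$\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,\sigma}(n,h)\,u^h x^n = \sum_{m\ge 1}\sum_{n\ge 2m} \mathsf{C}_k(n,m)\left(\frac{u\cdot(u x^2)^{\sigma-1}}{1 - u x^2}\right)^{m} x^n + \frac{x}{1-x}$,

which is the right-hand side of eq. (3.9), whence Lemma 3. □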



We next enumerate core-structures. The Theorem has two main parts: the first is the "inversion"
of Lemma 2. It allows us to express core-structures via all structures and follows by Möbius
inversion. The second part deals with the asymptotics of core-structures. The asymptotic formula
follows then from transfer theorems (the super-critical case) [7] applied to some version of the
functional identity of Lemma 1.

Theorem 3. (Core-structures) Suppose $k \in \mathbb{N}$, $k \ge 2$, let $x$ be an indeterminate,
$\rho_k$ the dominant, positive real singularity of $\sum_{n\ge 0} f_k(2n,0)\,z^{2n}$ (eq. (2.5)) and
$u_1(x) = \frac{1}{x^2+1}$. Then for $h \ge 1$, the numbers of $k$-noncrossing core-structures,
$\mathsf{C}_k(n,h)$, are given by

(3.12)  $\mathsf{C}_k(n,h) = \sum_{b=0}^{h-1} (-1)^{h-b-1}\binom{h-1}{b}\,\mathsf{T}_{k,1}(n - 2h + 2b + 2,\ b+1)$ .

Furthermore we have the functional equation

(3.13)  $\sum_{n\ge 0} \mathsf{C}_k(n)\,x^n = \frac{1}{u_1 x^2 - x + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u_1}\,x}{u_1 x^2 - x + 1}\right)^{2n} - \frac{x}{1-x}$

and the asymptotic expression

(3.14)  $\mathsf{C}_k(n) \sim \varphi_k(n)\left(\frac{1}{\kappa_k}\right)^{n}$ ,

where $\kappa_k$ is the dominant positive real singularity of $\sum_{n\ge 0} \mathsf{C}_k(n)\,x^n$
and the minimal positive real solution of the equation
$\frac{\sqrt{u_1}\,x}{u_1 x^2 - x + 1} = \rho_k$, and $\varphi_k(n)$ is a polynomial over $n$ derived
from the asymptotic expression of $f_k(2n,0) \sim \varphi_k(n)\big(\frac{1}{\rho_k}\big)^{2n}$ of
eq. (2.4).

n     C_3(n)     C_4(n)
1     1          1
2     1          1
3     2          2
4     5          5
5     12         12
6     31         32
7     88         95
8     263        301
9     814        1001
10    2604       3495
11    8575       12708
12    28936      47932
13    99726      186581
14    350151     747619
15    1249865    3073207
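The first entries of the $\mathsf{C}_3(n)$ column can be reproduced from eq. (3.12). The following is a minimal sketch for $k = 3$, assuming the closed form of $f_3$ from eq. (2.3), the reading of eq. (2.6) used earlier, and that $\mathsf{C}_3(n)$ in the table counts the structure without arcs as well; all function names are hypothetical.

```python
from math import comb

def catalan(m):
    return comb(2 * m, m) // (m + 1)

def f3(n, l):
    """f_3(n, l) via eq. (2.3)."""
    if l > n or (n - l) % 2:
        return 0
    m = (n - l) // 2
    return comb(n, l) * (catalan(m + 2) * catalan(m) - catalan(m + 1) ** 2)

def T31_arcs(n, h):
    """T_{3,1}(n, h) via eq. (2.6)."""
    l = n - 2 * h
    return sum((-1) ** b * comb(n - b, b) * f3(n - 2 * b, l) for b in range(h + 1))

def C3_arcs(n, h):
    """C_3(n, h) for h >= 1 via the inversion formula, eq. (3.12)."""
    return sum((-1) ** (h - b - 1) * comb(h - 1, b) * T31_arcs(n - 2 * h + 2 * b + 2, b + 1)
               for b in range(h))

def C3(n):
    """All 3-noncrossing core-structures on n vertices, including the arc-free one."""
    return 1 + sum(C3_arcs(n, h) for h in range(1, n // 2 + 1))

print([C3(n) for n in range(1, 11)])
```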



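Proof. We set

$\forall\, 0 \le i \le h-1;\quad a(i) = \mathsf{C}_k(n - 2(h-1-i),\ i+1)$
$\forall\, 0 \le i \le h-1;\quad b(i) = \mathsf{T}_{k,1}(n - 2(h-1-i),\ i+1)$ .

We first employ Lemma 2 for $\sigma = 1$:

(3.15)  $\mathsf{T}_{k,1}(n,h) = \sum_{b=0}^{h-1} \binom{h-1}{b}\,\mathsf{C}_k(n-2b,\,h-b)$, i.e. $b(h-1) = \sum_{i=0}^{h-1}\binom{h-1}{i}\,a(i)$.

Via Möbius-inversion we arrive at $a(h-1) = \sum_{i=0}^{h-1} (-1)^{h-1-i}\binom{h-1}{i}\,b(i)$, which
is equivalent to

$\mathsf{C}_k(n,h) = \sum_{b=0}^{h-1} (-1)^{h-b-1}\binom{h-1}{b}\,\mathsf{T}_{k,1}(n - 2h + 2b + 2,\ b+1)$

whence eq. (3.12). We proceed by proving eq. (3.13). First Lemma 3 implies for $\sigma = 1$:

(3.16)  $\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,1}(n,h)\,u^h x^n = \sum_{n\ge 0}\sum_{h\le n/2} \mathsf{C}_k(n,h)\left(\frac{u}{1 - u x^2}\right)^{h} x^n + \frac{x}{1-x}$

and we inspect that $u_1(x) = \frac{1}{x^2+1}$ is the unique solution of $\frac{u}{1 - u x^2} = 1$.
Accordingly we obtain

$\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,1}(n,h)\,u_1^h x^n = \sum_{n\ge 0} \mathsf{C}_k(n)\,x^n + \frac{x}{1-x}$ .

Secondly, setting $u = u_1$, Lemma 1 provides an interpretation of the lhs of eq. (3.16):

$\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,1}(n,h)\,u_1^h x^n = \frac{1}{u_1 x^2 - x + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u_1}\,x}{u_1 x^2 - x + 1}\right)^{2n}$

and we can conclude

$\sum_{n\ge 0} \mathsf{C}_k(n)\,x^n = \sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,1}(n,h)\,u_1^h x^n - \frac{x}{1-x}$

whence eq. (3.13). As for eq. (3.14) we consider the functional equation

$\sum_{n\ge 0} \mathsf{C}_k(n)\,x^n = \frac{1}{u_1 x^2 - x + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u_1}\,x}{u_1 x^2 - x + 1}\right)^{2n} - \frac{x}{1-x}$ .

Let us denote $W(x) = \sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u_1}\,x}{u_1 x^2 - x + 1}\right)^{2n}$.
Claim. All dominant singularities of $\sum_{n\ge 0} \mathsf{C}_k(n)\,z^n$ are dominant singularities
of $W(z)$ and $\kappa_k$ is a dominant singularity.

To prove the Claim we observe that a dominant singularity of

$\frac{1}{u_1 z^2 - z + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u_1}\,z}{u_1 z^2 - z + 1}\right)^{2n}$

is either a singularity of $W(z)$ or of $\frac{1}{u_1 z^2 - z + 1}$. Suppose there exists some
singularity $\zeta \in \mathbb{C}$ which is a root of $u_1 z^2 - z + 1$. By construction
$\zeta \neq 0$ and $\zeta$ is necessarily a singularity of $W(z)$. Suppose $|\zeta| \le \kappa_k$,
then we arrive at the contradiction $|W(\zeta)| > W(\kappa_k)$ since $W(\zeta)$ is not finite and
$W(\kappa_k) = \sum_{n\ge 0} f_k(2n,0)\,\rho_k^{2n} < \infty$. Therefore all dominant singularities
of $\sum_{n\ge 0} \mathsf{C}_k(n)\,z^n$ are dominant singularities of $W(z)$. By Pringsheim's
Theorem [25], $\sum_{n\ge 0} \mathsf{C}_k(n)\,z^n$ has a dominant positive real singularity which by
construction equals $\kappa_k$ and the Claim follows.
The Claim immediately implies that the exponential growth rate is the inverse of the minimal
positive real solution of the equation $\frac{\sqrt{u_1}\,x}{u_1 x^2 - x + 1} = \rho_k$. According to
[27] the power series $\sum_{n\ge 0} f_k(2n,0)\,z^{2n}$ has an analytic continuation in a
$\Delta_{\rho_k}$-domain and its coefficients satisfy
$f_k(2n,0) \sim \varphi_k(n)(\rho_k^{-1})^{2n}$, where $\varphi_k(n)$ is given by eq. (2.4). We can
therefore employ Theorem 2, which via eq. (3.13) allows us to transfer the subexponential factors
from the asymptotic expressions for $f_k(2n,0)$ to $\mathsf{C}_k(n)$. From this eq. (3.14) follows
and the proof of Theorem 3 is complete. □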



4. Pseudoknot RNA with stack-length ≥ σ



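In this Section we combine Lemma 1 and Lemma 3 in order to derive the generating function of
$k$-noncrossing RNA pseudoknot structures with minimum stack-size $\sigma$. It is worth mentioning
that core-structures are only implicit (via Lemma 3) in its proof: all expressions and relations are
based on $\mathsf{T}_{k,1}(n',h')$ and $\mathsf{T}_{k,1}(n)$, respectively. The latter are given by
Theorem 1. Our main result reads

Theorem 4. Let $k, \sigma \in \mathbb{N}$, $k \ge 2$, let $x$ be an indeterminate and $\rho_k$ the
dominant, positive real singularity of $\sum_{n\ge 0} f_k(2n,0)\,z^{2n}$ (eq. (2.5)). Then

$\mathsf{T}_{k,\sigma}(n,h) = \sum_{b=\sigma-1}^{h-1}\sum_{j=0}^{h-b-1} (-1)^{h-b-j-1}\binom{b + (2-\sigma)(h-b) - 1}{h-b-1}\binom{h-b-1}{j}\,\mathsf{T}_{k,1}(n - 2h + 2j + 2,\ j+1)$ .

Furthermore, $\mathsf{T}_{k,\sigma}(n)$ satisfies the following identity

(4.1)  $\sum_{n\ge 0} \mathsf{T}_{k,\sigma}(n)\,x^n = \frac{1}{u_0 x^2 - x + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u_0}\,x}{u_0 x^2 - x + 1}\right)^{2n}$

where $u_0 = \frac{(x^2)^{\sigma-1}}{(x^2)^{\sigma} - x^2 + 1}$. Furthermore

(4.2)  $\mathsf{T}_{k,\sigma}(n) \sim \varphi_k(n)\left(\frac{1}{\gamma_{k,\sigma}^{-1}}\right)^{n}$

holds, where $\gamma_{k,\sigma}^{-1}$ is a dominant singularity of
$\sum_{n\ge 0} \mathsf{T}_{k,\sigma}(n)\,x^n$ and the minimal positive real solution of the equation

(4.3)  $\frac{\sqrt{\frac{(x^2)^{\sigma-1}}{(x^2)^{\sigma} - x^2 + 1}}\;x}{\Big(\frac{(x^2)^{\sigma-1}}{(x^2)^{\sigma} - x^2 + 1}\Big)x^2 - x + 1} = \rho_k$

and $\varphi_k(n)$ is a polynomial over $n$ derived from the asymptotic expression of
$f_k(2n,0) \sim \varphi_k(n)\big(\frac{1}{\rho_k}\big)^{2n}$ of eq. (2.4).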



In the following we present the first 18 numbers of $\mathsf{T}_{3,2}(n)$, $\mathsf{T}_{3,3}(n)$,
$\mathsf{T}_{4,2}(n)$ and $\mathsf{T}_{4,3}(n)$:

n     T_{3,2}(n)   T_{3,3}(n)   T_{4,2}(n)   T_{4,3}(n)
1     1            1            1            1
2     1            1            1            1
3     1            1            1            1
4     1            1            1            1
5     2            1            2            1
6     4            1            4            1
7     8            2            8            2
8     15           4            15           4
9     28           8            28           8
10    55           14           55           14
11    110          23           110          23
12    222          36           223          36
13    448          56           455          56
14    913          91           944          91
15    1890         155          1995         155
16    3964         275          4274         275
17    8385         491          9244         491
18    17846        869          20182        870
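For $k = 3$ the first entries of the table can be recomputed from the exact formula of Theorem 4, i.e. the Core-lemma (3.1) combined with the inversion (3.12). The following is a minimal sketch, assuming the closed form of $f_3$ from eq. (2.3) and the reading of eq. (2.6) used earlier; the helper names are hypothetical, and the binomial coefficient is taken to vanish for a negative upper index.

```python
from math import comb

def catalan(m):
    return comb(2 * m, m) // (m + 1)

def f3(n, l):
    """f_3(n, l) via eq. (2.3)."""
    if l > n or (n - l) % 2:
        return 0
    m = (n - l) // 2
    return comb(n, l) * (catalan(m + 2) * catalan(m) - catalan(m + 1) ** 2)

def T31_arcs(n, h):
    """T_{3,1}(n, h) via eq. (2.6)."""
    l = n - 2 * h
    return sum((-1) ** b * comb(n - b, b) * f3(n - 2 * b, l) for b in range(h + 1))

def C3_arcs(n, h):
    """C_3(n, h) via eq. (3.12)."""
    return sum((-1) ** (h - b - 1) * comb(h - 1, b) * T31_arcs(n - 2 * h + 2 * b + 2, b + 1)
               for b in range(h))

def binom(top, bottom):
    """Binomial coefficient that is zero for a negative upper index."""
    return comb(top, bottom) if top >= 0 else 0

def T3(n, sigma):
    """T_{3,sigma}(n): sum of eq. (3.1) over h >= 1, plus the arc-free structure."""
    total = 1
    for h in range(1, n // 2 + 1):
        for b in range(sigma - 1, h):
            m = h - b  # number of arcs of the core
            total += binom(b + (2 - sigma) * m - 1, m - 1) * C3_arcs(n - 2 * b, m)
    return total

print([T3(n, 2) for n in range(1, 11)])
print([T3(n, 3) for n in range(1, 11)])
```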
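Proof. The first assertion follows from Lemma 2 and eq. (3.12) (the inversion of eq. (3.15)), which
allows us to express the terms $\mathsf{C}_k(n-2b,\,h-b)$ via $\mathsf{T}_{k,1}(n',h')$. In order to
prove eq. (4.1) we apply Lemma 3 twice. First, Lemma 3 implies for arbitrary $\sigma$ and $u = 1$

(4.4)  $\sum_{n\ge 0} \mathsf{T}_{k,\sigma}(n)\,x^n = \sum_{n\ge 0}\sum_{h\le n/2} \mathsf{C}_k(n,h)\left(\frac{(x^2)^{\sigma-1}}{1 - x^2}\right)^{h} x^n + \frac{x}{1-x}$

and secondly, it guarantees for arbitrary $u \in \mathbb{C}$ and $\sigma = 1$

(4.5)  $\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,1}(n,h)\,u^h x^n = \sum_{n\ge 0}\sum_{h\le n/2} \mathsf{C}_k(n,h)\left(\frac{u}{1 - u x^2}\right)^{h} x^n + \frac{x}{1-x}$ .

The key observation (the "bridge") is here the relation between $\sigma$ and $u$ via the terms
$\frac{(x^2)^{\sigma-1}}{1 - x^2}$ and $\frac{u}{1 - u x^2}$. It is clear that for any
$\sigma \in \mathbb{N}$ there exists a unique solution $u_0$ for

(4.6)  $\frac{(x^2)^{\sigma-1}}{1 - x^2} = \frac{u}{1 - u x^2}$

given by $u_0 = \frac{(x^2)^{\sigma-1}}{(x^2)^{\sigma} - x^2 + 1}$. This allows us to express

$\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{C}_k(n,h)\left(\frac{(x^2)^{\sigma-1}}{1 - x^2}\right)^{h} x^n$

for any $\sigma$ via the bivariate generating function
$\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,1}(n,h)\,u^h x^n$. Now we employ Lemma 1, which provides an
interpretation of the latter as follows:

(4.7)  $\sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,1}(n,h)\,u^h x^n = \frac{1}{u x^2 - x + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u}\,x}{u x^2 - x + 1}\right)^{2n}$ .

We accordingly obtain

$\sum_{n\ge 0} \mathsf{T}_{k,\sigma}(n)\,x^n = \sum_{n\ge 0}\sum_{h\le n/2} \mathsf{C}_k(n,h)\left(\frac{(x^2)^{\sigma-1}}{1 - x^2}\right)^{h} x^n + \frac{x}{1-x}$
$\quad = \sum_{n\ge 0}\sum_{h\le n/2} \mathsf{C}_k(n,h)\left(\frac{u_0}{1 - u_0 x^2}\right)^{h} x^n + \frac{x}{1-x}$
$\quad = \sum_{n\ge 0}\sum_{h\le n/2} \mathsf{T}_{k,1}(n,h)\,u_0^h x^n$
$\quad = \frac{1}{u_0 x^2 - x + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u_0}\,x}{u_0 x^2 - x + 1}\right)^{2n}$

and eq. (4.1) follows. We set $V(z) = \sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u_0}\,z}{u_0 z^2 - z + 1}\right)^{2n}$.
Claim. All dominant singularities of $\sum_{n\ge 0} \mathsf{T}_{k,\sigma}(n)\,z^n$ are singularities of
$V(z)$ and in particular $\gamma_{k,\sigma}^{-1}$ is a dominant singularity.

To prove the Claim we observe that a dominant singularity of

$\frac{1}{u_0 z^2 - z + 1}\sum_{n\ge 0} f_k(2n,0)\left(\frac{\sqrt{u_0}\,z}{u_0 z^2 - z + 1}\right)^{2n}$

is either a singularity of $V(z)$ or of $\frac{1}{u_0 z^2 - z + 1}$. Suppose there exists some
singularity $\zeta \in \mathbb{C}$ which is a root of $u_0 z^2 - z + 1$. By construction
$\zeta \neq 0$ and $\zeta$ is necessarily a singularity of $V(z)$. Suppose
$|\zeta| \le \gamma_{k,\sigma}^{-1}$, then we arrive at the contradiction
$|V(\zeta)| > V(\gamma_{k,\sigma}^{-1})$ since $V(\zeta)$ is not finite and

$V(\gamma_{k,\sigma}^{-1}) = \sum_{n\ge 0} f_k(2n,0)\,\rho_k^{2n} < \infty$ .

Therefore all dominant singularities of $\sum_{n\ge 0} \mathsf{T}_{k,\sigma}(n)\,z^n$ are singularities
of $V(z)$. By Pringsheim's Theorem [25], $\sum_{n\ge 0} \mathsf{T}_{k,\sigma}(n)\,z^n$ has a dominant
positive real singularity which by construction equals $\gamma_{k,\sigma}^{-1}$ and the Claim follows.
The equation

(4.8)  $\frac{\sqrt{\frac{(x^2)^{\sigma-1}}{(x^2)^{\sigma} - x^2 + 1}}\;x}{\Big(\frac{(x^2)^{\sigma-1}}{(x^2)^{\sigma} - x^2 + 1}\Big)x^2 - x + 1} = \rho_k$

has a minimal positive real solution and the Claim implies that its inverse equals the exponential
growth-rate. According to [27] the power series $\sum_{n\ge 0} f_k(2n,0)\,z^{2n}$ has an analytic
continuation in a $\Delta_{\rho_k}$-domain and its coefficients satisfy
$f_k(2n,0) \sim \varphi_k(n)(\rho_k^{-1})^{2n}$, where $\varphi_k(n)$ is given by eq. (2.4). In view
of eq. (4.1) we can therefore employ Theorem 2, which allows us to transfer the subexponential
factors from the asymptotic expressions for $f_k(2n,0)$ to $\mathsf{T}_{k,\sigma}(n)$, whence eq.
(4.2). This completes the proof of Theorem 4. □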
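The growth rates quoted in the Introduction can be recovered numerically from the singularity equations of Theorem 3 and Theorem 4. The following is a minimal sketch under one assumption that is not stated in this text: $\rho_k = \frac{1}{2(k-1)}$, the value known from the literature on $k$-noncrossing matchings. The function names, the scan step and the bisection depth are illustrative choices.

```python
from math import sqrt

def theta_sigma(x, sigma):
    """Left-hand side of eq. (4.3), with u_0 = (x^2)^(sigma-1) / ((x^2)^sigma - x^2 + 1)."""
    x2 = x * x
    u0 = x2 ** (sigma - 1) / (x2 ** sigma - x2 + 1)
    return sqrt(u0) * x / (u0 * x2 - x + 1)

def theta_core(x):
    """Analogous expression for core-structures (Theorem 3), with u_1 = 1 / (x^2 + 1)."""
    x2 = x * x
    u1 = 1.0 / (x2 + 1)
    return sqrt(u1) * x / (u1 * x2 - x + 1)

def minimal_positive_root(g, step=1e-4):
    """Scan upward from 0 for the first sign change of g, then bisect."""
    lo = step
    while g(lo + step) < 0:
        lo += step
    hi = lo + step
    for _ in range(60):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi

def growth_rate(k, sigma):
    rho = 1.0 / (2 * (k - 1))  # ASSUMPTION: rho_k = 1 / (2(k-1)), not stated in this text
    return 1.0 / minimal_positive_root(lambda x: theta_sigma(x, sigma) - rho)

def core_growth_rate(k):
    rho = 1.0 / (2 * (k - 1))  # same assumption as above
    return 1.0 / minimal_positive_root(lambda x: theta_core(x) - rho)

for k in (2, 3, 4):
    print(k, round(growth_rate(k, 1), 4), round(growth_rate(k, 2), 4),
          round(core_growth_rate(k), 4))
```

For $\sigma = 1$ the substitution $u_0 = 1$ reduces the equation to $\frac{x}{x^2 - x + 1} = \rho_k$, which for $k = 3$ has the closed-form solution behind the rate $\frac{5+\sqrt{21}}{2}$ of Theorem 1.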